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Abstract 

Simulation studies that validate statistical techniques for fMRI data are challenging due to the complexity of the data. 
Therefore, it is not surprising that no common data generating process is available (i.e. several models can be found to 
model BOLD activation and noise). Based on a literature search, a database of simulation studies was compiled. The 
information in this database was analysed and critically evaluated focusing on the parameters in the simulation design, the 
adopted model to generate fMRI data, and on how the simulation studies are reported. Our literature analysis demonstrates 
that many fMRI simulation studies do not report a thorough experimental design and almost consistently ignore crucial 
knowledge on how fMRI data are acquired. Advice is provided on how the quality of fMRI simulation studies can be 
improved. 
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Introduction 

Twenty years ago, functional magnetic resonance imaging 
(fMRI) was established as a method to measure brain activity [1,2]. 
In these past twenty years, this technique has been used 
increasingly and has pioneered the search to map and connect 
the brain that has caused a world-wide collaboration of scientists 
from different disciplines. Engineers and physicists are intrigued by 
the acquisition of the fMRI data, while physicians and psychol- 
ogists are challenged to adapt their behavioural experimental 
protocols to the scanner environment. Last but not least, the 
analysis of fMRI data has been, and still is, a topic of numerous 
discussions among statisticians. The latter are confronted with the 
fact that the data acquired through fMRI have no ground truth. 
This ground truth is needed to ensure validation of the statistical 
methods that are used to analyse the data and to assess statistical 
properties such as sensitivity, specificity, bias and robustness. Great 
efforts to establish this ground truth have gone into the 
development of mechanical models [3], while direct measuring 
of the neural activity with intracranial EEG (iEEG) offers another 
solution [4]. However, for most studies iEEG may not be feasible 
and simulations may be the only realistic approach to establish the 
ground truth of fMRI data. 

Neurolmage, one of the flagship journals in the neuroimaging 
community, celebrated the 20th anniversary of the first fMRI 
publications with a special volume that consisted of 103 reviews 
about the early beginnings, developments in acquisition, software, 
processing and methodology, and prospectives for the future of 
fMRI [5] . Although the advances in statistical methods for fMRI 
data are discussed in several of these reviews, simulations an sich 
are not mentioned. In general, it appears that simulation studies 
are still not standard practice for fMRI methods validation. A 
possible explanation is that it can be quite challenging to simulate 
fMRI data. Not only is the coupling between the neural activity 
and the Blood Oxygenation Dependent Level (BOLD) not 
completely understood [6], fMRI data are also characterised by 



a great deal of noise coming from multiple sources [7]. 
Consequentiy, no common data generating process for fMRI 
data is available and the data generation in fMRI simulation 
studies is mostly defined ad hoc. 

The goal of this review is to provide an overview of the most 
common data generation methods used in fMRI simulation 
studies. An established and accepted data generating process does 
not yet exist and therefore an investigation of the existing 
published models is called for. In particular, the validity of these 
data generating methods is analysed and the overall reporting and 
conduct of fMRI simulation studies is critically reviewed. The rest 
of the paper is organised as follows: In the Methods section the 
article selection criteria are reported that were applied to establish 
a database of fMRI simulation studies literature, and the focus 
points of the article evaluation are discussed. The Results section 
focuses on different aspects of the simulation studies, namely, the 
goals of the studies, the experimental design under investigation, 
the simulation parameters and the data generation models. 
Finally, in the Discussion, best practice recommendations are 
provided to increase the reliability and generalisability of fMRI 
simulation studies. 

Materials and Methods 

Article selection 

Articles were selected from the Web of Science database using 
the following query: "fmri AND simulation AND (statistics OR 
data analysis)". By excluding articles labelled as reviews or 
proceedings, this search resulted in 318 hits (Result as of January, 
1st 2013). All these articles were manually inspected on content 
and relevance. This screening resulted in excluding articles based 
on the following criteria: the conducted simulations were for 
another modality (e.g. PET, EEG, MEG, ...); no time series were 
simulated (e.g. inference methods are often validated on simulated 
statistical maps); non-human fMRI was simulated; and no 
simulation study was conducted (e.g. papers presenting simulation 



PLOS ONE | www.plosone.org 



1 



July 2014 | Volume 9 | Issue 7 | e101953 



A Review of fMRI Simulation Studies 



software). After exclusion, the remaining 119 articles were taken 
into account in this analysis. Full bibliographic details of our 
sample can be found in the Supporting Information (Table SI). 
These articles were published in 39 peer-reviewed academic 
journals (Table 1) over a period of 16 years (Figure 1). In this 
sample, most simulation studies were published in Neurolmage 
(37), Human Brain Mapping (1 1), IEEE Transactions on Medical 
Imaging (10), Magnetic Resonance Imaging (7), IEEE Transaction 
on Biomedical Engineering (6) and the Journal of Magnetic 
Resonance Imaging (6). 

During our article selection, we focused on simulation studies 
conducted to validate or compare analysis procedures for BOLD- 
fMRI data. In order to perform this validation, a data generating 
process results in artificial data that reflect to some degree the 
characteristics of real measured fMRI data. From a statistical 
perspective, scanning parameters that influence magnetic proper- 
ties of the data (e.g. flip angle) are of less importance since they 
mainly have an effect on the signal-to-noise ratio. For instance, 
when these scanning parameters are optimised, the baseline signal 
might increase while the noise level decreases. The crucial aspect is 
to determine the components in the data that are expected to have 
an effect on the data analysis and model these components while 
generating the simulated fMRI data. 

Article evaluation 

In the present study, we analysed the sections describing the 
simulation study for the selected papers. Where necessary the 
appendices or supplementary materials were also included and 
whenever there was still missing information after screening these 
sections, the whole paper was searched for this information. Only 
the reported methodology was evaluated (i.e. no authors were 
contacted for more information). There might be a discrepancy 
between the conducted and reported simulation studies (e.g. not all 
details are mentioned), however, to ensure reproducible science all 
critical elements should be reported. It may not always be feasible 
to report everything in the main text, but academic journals allow 
for crucial content to be described in appendices or through online 
supplementary materials. For each study we evaluated the goal of 
the simulation study, the simulation parameters and the data 
generating process. In the case that multiple simulation studies 
were present in the article, this information was retrieved from the 
most complex case that was described. In the Results section, 
summarised results are presented. For a detailed results list on the 
individual study level, the reader is referred to Table 2. 



Table 1. Overview of journals in the survey. Full details of the 
included studies can be found in the supporting information 
(Table SI). 





Journal title 


Number of articles 


Neurolmage 


37 


Human Brain Mapping 


11 


IEEE Transactions on Medical Imaging 


10 


Magnetic Resonance Imaging 


7 


IEEE Transactions on Biomedical Engineering 


6 


Journal of Magnetic Resonance Imaging 


6 


Magnetic Resonance in Medicine 


4 


Other 


38 



doi:1 0.1 371 /journal.pone.01 01 953.t001 



Results 

Study goals 

Simulation studies are conducted to evaluate statistical models 
based on a given experimental design. For each article we assessed 
which statistical technique was validated. Six categories of 
statistical models were distinguished (see Figure 2, left panel). 
Most simulation studies are conducted for signal decomposition 
models like Principal Components Analysis (PCA), Independent 
Component Analysis (ICA) and Wavelet analysis. This group of 
methods is closely followed by General Linear Model (GLM) 
analysis, Likelihood Ratio Tests (LRT) and Mests. 11.8% of the 
simulation studies investigate properties of classification techniques 
using for example Support Vector Machines or cluster analysis. 
Methods that are less represented in our sample are connectivity 
analyses or preprocessing methods like motion correction and 
spatial smoothing. All studies that did not use any of the previous 
methods were gathered in a "rest" category. In this category are 
included, for example, HRF estimation methods, spatio-temporal 
models, bootstrapping and nonparametric techniques. 

Experimental design 

The methods described above are validated using a given 
experimental design (Table 3a). The majority of simulation studies 
report using a block design for the generation of the BOLD 
activity. When this design is not used, modelled activation is based 
on an event-related design or it concerns a resting-state study. 

Simulation parameters 

The general goal of a simulation study is to research a certain 
outcome (e.g. power, bias, ...) under several conditions (e.g. noise 
level, HRF variability, ...). The most common method to achieve 
this goal is by conducting a Monte Carlo experiment. The 
simulation reports in our database were evaluated on the 
dimensions of the simulated data, the number of replications 
and parameter variation. 

Data dimensions. fMRI data have in essence four dimen- 
sions (i.e. coordinates in an xyz-space and time). However, the 
majority of articles in our sample published results for 3D data 
where time series are simulated for all voxels in a single slice 
(Table 3b), while one fifth considered full 4D fMRI data. On the 
other hand, many of the articles reported simulating fMRI time 
series only with no spatial context. In this case, mass-univariate 
techniques were mostly evaluated that also regard fMRI data as 
being multiple measurements of single time series. A very small 
proportion considered two-dimensional data. This was reported 
exclusively in an ICA validation context, where the fMRI data are 
organised as voxels x timepoints. 

Replications. The overall majority of the selected articles 
considered single-subject data, while the remaining articles 
simulated data for multiple subjects. In these last studies, the 
number of subjects that was simulated corresponded typically with 
sample sizes reported in real fMRI studies (e.g. 4 to 20 subjects) 
and the data for these subjects were mostly simulated once (with a 
few notable exceptions, see Figure 3, right panel). For the single- 
subject simulation studies, the number of repetitions was higher in 
the majority of the studies, while about one third of the articles 
reported 1 replication of the simulated data for each setting of the 
manipulated parameters. It should be noted that simulating 3D or 
4D datasets without any spatial correlations is equal to the 
simulation of fMRI time series with n replications where n is the 
number of voxels. This was true for 22 of the 37 studies that 
reported using 1 replication. However, for the remaining studies 
conclusions are based on 1 realisation of the data. Two studies 
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Figure 1. Overview of number of articles for each publication 
year included in the survey. 

doi:10.1371/journal.pone.0101953.g001 

reported simulating time series just once for each setting of the 
simulation parameters. 

Parameter variation. Other possible parameters taken into 
account in the simulations were, for example, strength of the 
modelled activation, number of time points, noise level, repetition 
time (TR), etc. The relevance of a simulation study is highly 
dependant on the representativeness of these chosen parameter 
values. To ensure that the parameters are characteristic for fMRI 
data, it is recommended that a range of values is evaluated. 
Additionally, a justification is expected on why specific values of 
certain parameters are chosen. In our sample both requirements 
were assessed (see Table 4 for an overview). A study was classified 
as using varying parameters as soon as more than one value of a 
specific parameter was considered. Whenever a reason for 
choosing a specific parameter value was reported, the simulation 
study was evaluated as positive on the justification of the chosen 
parameters. About one third of the studies reported a variation in 
the values and gave a justification for their choices. Frequendy 
reported variations were several noise levels and activation 
strengths that were taken into account. As for the choices of the 
values, authors mostiy justified these as being realistic values in real 
fMRI data or being estimated from real data. However, about one 
third of the studies reporting variation of the parameters did not 
give any justification, ten percent did justify the choice of the 
parameter values but only used one specific value for each 
parameter, while one fifth of the studies in our sample did neither. 

Data generation models 

Of all simulation studies investigated, 84% were pure synthetic 
simulations while the other 16% adopted a hybrid simulation 
strategy. In hybrid simulations, a resting-state dataset is acquired 
and synthetically generated activation is added to these data. As 
such, knowledge of the ground truth is assured while the noise is 
representative for real data. However, manipulation of the noise in 
the simulated fMRI data is not possible and replicating the data 
will be a costly process. Therefore, in most simulations the fMRI 
data are generated completely artificially. 

All synthetic simulation studies adopted an additive data 
generation model (e.g. [8]) in which three main components can 
be distinguished: (1) a baseline signal, (2) BOLD activation and (3) 
noise. However, half of the studies did not report using a baseline 
for the data, so we could assume that this is zero for these studies. 
For the other half, 47% used a static baseline, for example a 



constant when simulating time series and a template slice or 
volume that was repeated for each time point in the case of 
simulating 3D or 4D fMRI data. A minority of the studies (3%) 
used a varying baseline, meaning that the baseline values were 
varied over time, e.g. to model thermal shifts [9]. 

BOLD response. An important component in the simulated 
fMRI data is the BOLD response because this signal defines the 
ground truth in the simulation studies. Despite the fact that the 
coupling between the neural activation and the BOLD response is 
still not completely understood [6] , several models are available to 
generate a haemodynamic response function (HRF). See Figure 4 
(left) for an overview of the models used in the selected articles. 
Those methods are, for example, a gamma function [10,11], a 
difference of two gamma functions, also known as the canonical 
HRF [12,13] or the BaUoon model [14,15]. The different shapes 
of these models are illustrated in Figure 4 (right). Nevertheless, one 
third of the reported simulation studies disregarded any BOLD 
characteristics and chose a square wave (i.e. a boxcar function) to 
represent the BOLD activation in the simulated fMRI data. When 
no experimental task was simulated, resting-state activation was 
predominantly modelled as a set of sinusoidal functions, although 
a few of the selected studies did not simulate any BOLD activation. 
The shape of the HRF varies immensely from brain region to 
brain region and also from subject to subject. One fifth of the 
simulation studies reported modelling this variation in the HRF 
parameters, while the majority considered a fixed HRF in all 
simulations (Table 4b). 

Noise model. Noise is not only characteristic for fMRI data 
but also ensures generalisability of the conclusions based on 
simulations. All simulation studies incorporated some noise 
generating process (see Figure 5, left panel, for an overview). 
The vast majority of the synthetic simulation studies (i.e. 75%) 
selected the noise randomly from a Gaussian distribution. An 
additional 9% added also some drift function to this noise, while 
about 7% of the studies considered a skewed noise distribution 
(e.g. Rician or super Gaussian distribution). The remainder of the 
studies used a very specific noise model (for example by adding 
physiological noise, using a uniform distribution or adding motion 
correlated noise), because they focused on the effects of these noise 
sources. fMRI noise is also known to be spatially and temporally 
correlated. However, the majority of the selected articles did not 
report modelling any correlations in the noise (Table 3c). 
Temporal correlation was almost exclusively modelled as an 
auto-regressive autocorrelation process. Typically this process was 
of order 1 , but there are exceptions that used a model order of 3 or 
4. Spatial correlations were typically created by spatial smoothing 
of the generated noise. A small fraction of the studies modeled 
both spatial and temporal correlations. 

Discussion 

Whenever statistical models are validated based on simulations, 
the model that is used for the data generation is of utmost 
importance. In this paper, a survey was conducted to list currendy 
used data generation models. Based on 1 1 9 research articles we 
described the simulation type, use and justification of simulation 
parameters and the different components in the fMRI data 
generating process. The survey results showed that current fMRI 
simulation studies sometimes lack a thorough experimental 
manipulation. The parameters in the simulation study (e.g. noise 
level, TR, HRF delay, etc.) are not always varied, while 
representative values of some of these parameters are not known. 
Using the results from iEEG could guide many of these parameter 
choices and make simulation more realistic in general. Further, the 
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Other 
Preprocessing 
Connectivity 
Classification/Cluster Analysis 
GLM/LRT/t-test 
ICA/PCA/Wavelets 




0 10 20 30 



Number of articles 

Figure 2. Statistical models investigated in the selected 
articles. 

doi:10.1371/journal.pone.0101953.g002 

number of replications is a major topic of concern. We observed 
that the conclusions of some of the simulation studies were based 
on only one replication of the random data generating process. 
When a simulation study is used to evaluate the expectation of 
random variables, the external validity of the study is threatened if 
only a few replications of the data generating process are used. 

Model-based versus data-based simulation 

In about 60% of the synthetic simulation studies, the fMRI data 
were generated based on a model similar to the model being 
validated (e.g. generating time series from a VAR model to 
evaluate Granger causality). As such, the simulation is entirely 
model-based and the assumptions of the model under investigation 
are completely met. Consequendy, the conclusions of these 
simulation studies give only partial information on the applicability 
of these models as an analysis tool for fMRI data, since fMRI data 
generally do not meet the assumptions of most statistical models. A 
better practice would be to start from the data themselves and to 
define a data generating process that models the different sources 
that are present in fMRI data. By using data-based simulations, 
the properties of the analysis techniques can be assessed in more 
realistic circumstances. 

In this context, it should be noted that the data generating 
process used in most current simulation studies is not compatible 
with the knowledge on how fMRI data are constructed. For 



Table 3. Proportions of (a) experimental designs, 
dimensions of the simulated data, and (c) the use 
correlated noise reported in the selected articles. 


(b) 
of 




a. Experimental designs 


Block Event-related 


Resting-state 






58.0% 21.8% 


20.2% 






b. Data dimensions 


ID 2D 


3D 


4D 




28.6% 1 .7% 


48.7% 


21.0% 




c. Noise correlations 


None Temporal 


Spatial 


Both 




58.0% 24.0% 


1 3.0% 


5.0% 





doi:1 0.1 371 /journal.pone.01 01 953.t003 



o 




1 2-20 25-100 200-800 1 000 2000-1E+6 NA 



Number of repetitions 

Figure 3. Overview of the number of replications for single- 
subject and multi-subject simulations. 

doi:1 0.1 371/journal.pone.01 01 953.g003 

instance, it is well-known that the BOLD response is the result of a 
haemodynamic coupling to neural activity. Although the precise 
dynamics are perhaps still debatable, there is consensus about the 
BOLD signal being a delayed response with varying dynamics 
over the brain regions and between subjects. However, about one 
third of the reported simulation studies in our database did not 
model any of these characteristics and used a simple boxcar 
function to distinguish stimulus induced activation from rest 
(Figure 4). About the same number did model the slow emergence 
of the BOLD signal by using a canonical HRF, but only a small 
fraction (i.e. two studies) did also model BOLD nonlinearities by 
means of the Balloon model. In the case of spontaneous neural 
activation (for example in resting-state studies), BOLD fluctuations 
were mostly modelled through sinusoidal functions with frequen- 
cies that are commonly observed in resting-state studies. However, 
describing these spontaneous fluctuations by sinusoids stems from 
the tradition to use ICA to analyse these data and is again more 
compatible with the model under investigation than being 
representative for the data. Further, variability of the BOLD 
response was taken into account only in about one fifth of the 
simulation studies (Table 4b). With regard to modelling BOLD 
activation, in a data-based simulation context at least some form of 
HRF should be used that takes into account the basic character- 
istics of the BOLD signal, while any variation of the parameters of 
this model will enhance the generalisability of the simulation 
results. 



Table 4. Proportions of studies reporting (a) parameter 
variation and justification of the chosen parameter values and 
(b) whether HRF variability was taken into account. 





a. Parameter variation and justification 




Justification 


of value 


Parameter variation 


No 


Yes 


No 


20.2% 


10.9% 


Yes 


32.8% 


36.1% 


b. HRF variation 




No 


Yes 




77.3% 


22.7% 



doi:1 0.1 371 /joumal.pone.01 01 953.t004 
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Sqi 




0 10 20 30 40 50 60 

Number of articles 



Figure 4. Overview of the different HRF functions used in the simulation studies (left) and illustration of the BOLD response shapes 
as the result of a block design fMRI experiment for the different HRF models (right, source: [24]). 

doi:1 0.1 371 /journal.pone.01 01 953.g004 



The generation of fMRI noise alsocauses a discrepancy between 
simulated and real fMRI data. The noise in fMRI consists of 
several sources [7,16], for example thermal noise, motion related 
noise, physiological noise and task-related noise. Nevertheless, the 
vast majority of simulation studies investigated here have only used 
a white Gaussian noise model to generate fMRI noise ignoring its 
multiple-source character. In some cases, spatial or temporal 
correlations are added. Again, this noise model is consistent with 
many of the statistical models for fMRI data (e.g. GLM). 
Unfortunately, the Gaussian noise model only accounts for a 
fraction of the noise in real data. One solution is to use hybrid 
simulations in which using real noise acquired in a resting-state 
study increases the realistic character of the simulated data. 
However, it is impossible to manipulate noise related parameters 
and unwanted activation in resting-state data can influence the 
simulation results. Moreover, multiple replications (i.e. acquiring 
resting-state data from multiple subjects) are costiy. Perhaps the 
better solution is to model more than only Gaussian noise (i.e. 
thermal noise) and also include, as has been demonstrated in 
several simulation studies, motion noise, physiological noise, signal 
drift, etc. In some simulation studies, the results will not be altered 
under a full noise model. It may not always be necessary to include 
all noise sources (e.g. if a certain noise source is removed or the 
influence of a source is assumed to be equal in all conditions), but 
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Figure 5. Overview of the noise models in the synthetic 
simulation studies. 
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this should be motivated at least. To assure generalisability of the 
simulation results, a more complex noise model, compared to the 
one that is generally adopted now, might be imperative. 

Recommendations for simulating fMRI data 

Based on these results we present some recommendations to 
improve the reliability and generalisability of fMRI simulation 
studies. 

1 . All parameters for which a value is chosen in the simulation 
experiments should be thoroughly justified. If a single value is 
not agreed upon, a range of values should be evaluated (see 
[8,17-19] for some examples). 

2. The conditions in the simulation study, (e.g. statistical model, 
parameter values,...), have to be combined in an experimental 
design. The construction of this experimental design is essential 
[20]. Factors that can be considered in the experiment are, for 
example, variations of parameter levels, analysis methods and 
number of replications. The most complete design is the full- 
factorial design, although there might be reasons to adopt 
fractional designs. Based on the experimental design, the 
simulation experiment will have external validity (i.e. the results 
can be generalised beyond a given experiment). 

3. A Monte Carlo experiment has to be repeated to exclude 
random influences on the simulation results. Therefore, a 
sufficient number of replications of the experiment has to be 
performed. In the case of time series simulations, at least 10000 
replications might be necessary, while for the simulation of 3D 
or 4D fMRI data a total of 100 might be enough. In general, 
the more replications, the better. For example, [19] generated 
10000 replications of 3D datasets, and [17] simulated 4D 
multi-subject datasets to represent twin data using 500 
replications of each paired dataset. In practice, this number 
can be limited due to time or computational constraints. When 
in doubt, the convergence of the results should be tested. 

4. The simulated task-related activation signal should reflect 
known properties of the BOLD response. This includes, but is 
not limited to, response delay, nonlinearities and inter-region 
and -subject variability. Either the canonical HRF or the 
Balloon model can be used (see [21] for an example using the 
Balloon model). 

5. fMRI noise is partially white (i.e. system noise) and this part 
can be modelled by random Gaussian noise. However, in 
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addition one should account for (residual) motions, heart rate 
and respiratory rate fluctuations, task-related noise and spatial 
and temporal correlations (see, for example, [8,22,23]). 
6. If either the BOLD model or the noise model is simplified, this 
should be duly motivated. 

Conclusion 

The use of simulation studies to validate statistical techniques 
for fMRI data should be highly encouraged, because simulation 
experiments are a fast and cheap tool to assess the quality and 
applicability of the analysis techniques. However, our survey of the 
fMRI simulation literature raised several concerns with respect to 
simulation studies as they are conducted now. The observed 
decrease in the number of fMRI simulation studies in recent years 
is troubling. Furthermore, it was demonstrated that the data 
generating process used to simulate fMRI data is often model- 
based and parameter variation in the data generating process is 
not implemented in a standard manner. 

A possible reason for the absence of a common fMRI data 
generation model might be the lack of established software 
packages. Current simulation studies are mainly conducted using 
in-house software routines that have no common programming 
language and are not widely available. Recently, developments to 
fill this gap have resulted in the release of software packages that 
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aged to consider the recommendations presented in this paper in 
order to increase the reliability and generalisability of the 
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